Goldston & Montgomery [3] showed that under the assumption of the Riemann Hypothesis
(RH), the Pair Correlation Conjecture of Montgomery [5] is equivalent to the assertion that

$$\int_0^X \bigl(\psi(x+h) - \psi(x) - h\bigr)^2\,dx \sim hX\log\frac{X}{h} \tag{1}$$

for $X^{\varepsilon} \le h \le X^{1-\varepsilon}$. In contrast, the Cramér model, which holds that the primes are
distributed as if the integer $n$ is prime with probability $1/\log n$, each one independently of
the others, would predict that this expression is $\sim hX\log X$. If the Cramér model does not
apply, one is left to speculate about the distribution of $\psi(x+h) - \psi(x)$. Recently the
authors [6] used a quantitative form of the Prime $k$-tuple Hypothesis to give a heuristic
determination of the moments of $\psi(x+h) - \psi(x) - h$, which supports the notion that
$\psi(x+h) - \psi(x)$ is approximately normally distributed with mean $\sim h$ and variance
$\sim h\log(X/h)$, as $x$ varies, $1 \le x \le X$, with $h$ in the range $X^{\varepsilon} \le h \le X^{1-\varepsilon}$. Odlyzko [7] and
Forrester & Odlyzko [2] analyzed the distribution of the zeros of the zeta function, and
found that the data are in close agreement with the Pair Correlation Conjecture. Hence
one might expect that numerical studies of primes in short intervals would lend support
to the conjectural relation (1). With this in mind we have calculated the distribution of
$\psi(x+h) - \psi(x) - h$ for $0 \le x \le X = 10^{10}$ when $h = 10^5$. In Table 1 below we give the
numerical values of the moments

$$\mu_k = \frac{1}{X}\int_0^X \bigl(\psi(x+h) - \psi(x) - h\bigr)^k\,dx$$

as well as of the normalized moments $\tilde\mu_k = \mu_k/\mu_2^{k/2}$. Since the normal distribution has
normalized moments $\tilde\mu_{2k+1} = 0$, $\tilde\mu_{2k} = (2k-1)(2k-3)\cdots 3\cdot 1$, we see that the normalized
moments are reasonably close to their anticipated values. The sixth moment is a little large,
which suggests that large deviations may be rather more common than would otherwise be
the case. In this regard we note that the largest value of $\psi(x+h) - \psi(x) - h$ encountered is
5046.08 at $x = 9559758537$, which is 5.30 times the standard deviation. In $10^5$ independent
samples, which is essentially what we presume to have here, the likelihood of such a large
deviation occurring is $1 - \Phi(5.3)^{10^5} = 0.00577$. Here
$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$ is the
cumulative distribution function of the normal variable with mean 0 and standard deviation
1. Similarly, the smallest value found is $-4920.06$ at $x = 5116809527$. This is $-5.17$ times
the standard deviation; such a large negative value would occur, in $10^5$ independent samples
of a normal variable, with probability $1 - \Phi(5.17)^{10^5} = 0.01163$. These large deviations
are somewhat larger than might be expected, but not so much larger, since the maximum
is larger than 4138 with probability 1/2. Finally, it was found that

$$\operatorname{meas}\bigl\{x \in [0, 10^{10}] : |\psi(x + 10^5) - \psi(x) - 10^5| > 3000\bigr\} = 3080882.$$

Since the size of this set is less than one fifth the size one would expect with a comparable
normal variable, the large deviations at this threshold are less common than would be
predicted.
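The tail probabilities quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming $10^5$ independent standard normal samples and taking the standard deviation to be the square root of the computed second moment $9.066 \times 10^5$; the function names are illustrative and are not taken from the original computation.

```python
import math

def normal_upper_tail(z):
    """Return 1 - Phi(z) for a standard normal variable, via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def prob_max_exceeds(z, n):
    """P(max of n independent standard normals > z) = 1 - Phi(z)**n."""
    # log1p/expm1 avoid cancellation, since 1 - Phi(z) is tiny here
    return -math.expm1(n * math.log1p(-normal_upper_tail(z)))

def median_of_max(n):
    """Solve Phi(z)**n = 1/2 for z by bisection on the upper tail."""
    target = -math.expm1(math.log(0.5) / n)  # required value of 1 - Phi(z)
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if normal_upper_tail(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

N = 10**5                      # assumed number of independent samples
SIGMA = math.sqrt(9.066e5)     # assumed standard deviation (from the second moment)
X = 10**10

p_max = prob_max_exceeds(5.30, N)
p_min = prob_max_exceeds(5.17, N)
median_max = median_of_max(N) * SIGMA
expected_measure = X * 2.0 * normal_upper_tail(3000.0 / SIGMA)

print(f"P(max > 5.30 sigma) = {p_max:.5f}")
print(f"P(max > 5.17 sigma) = {p_min:.5f}")
print(f"median of the maximum = {median_max:.0f}")
print(f"observed / expected measure beyond 3000 = {3080882 / expected_measure:.3f}")
```

Under these assumptions the ratio of the observed measure to the normal prediction comes out below one fifth, in line with the remark above.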

  $k$    $\tilde\mu_k$
   0      1.0000
   1      0.0001
   2      1.0000
   3     -0.0014
   4      3.0408
   5     -0.0319
   6     15.5288

Table 1. Moments of $\psi(x+h) - \psi(x) - h$ for $0 \le x \le X = 10^{10}$ with $h = 10^5$.
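As a quick check of the comparison with the normal distribution, the sketch below sets the seven tabulated values against the moments $(2k-1)(2k-3)\cdots 3\cdot 1$ of a standard normal variable. It assumes that the tabulated values are the normalized moments for $k = 0, 1, \ldots, 6$ in that order; the dictionary and function names are illustrative.

```python
def normal_moment(k):
    """k-th moment of a standard normal: 0 for odd k, (k-1)(k-3)...3.1 for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result

# Assumed reading of Table 1: normalized moments for k = 0..6, in order
table_normalized = {0: 1.0000, 1: 0.0001, 2: 1.0000, 3: -0.0014,
                    4: 3.0408, 5: -0.0319, 6: 15.5288}

deviations = {k: value - normal_moment(k) for k, value in table_normalized.items()}
for k, dev in deviations.items():
    print(f"k={k}: normal {normal_moment(k):2d}, observed {table_normalized[k]:8.4f}, "
          f"difference {dev:+.4f}")
```

On this reading the only sizeable discrepancy is in the sixth moment, as noted in the text.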

In addition to the numerical data described above, the results of sieving were also 
recorded in the form of the cumulative distribution function, and plotted against that of a 
normal variable with the same variance, in Figure 1. The fit to normal is impressive. Note 
that both functions are being graphed on the same coordinate axes. 
Figure 1. Distribution of $\psi(x+h) - \psi(x) - h$ (solid) versus normal (dashed).
One of the objects of the numerical study was to test whether the variance of
$\psi(x+h) - \psi(x) - h$ is near the value $h\log X = 23.02 \times 10^5$ that would be predicted by the Cramér
model, or whether it is nearer to the smaller variance $h\log(X/h) = 11.51 \times 10^5$ predicted
by (1). The big surprise in the data is that the variance $9.07 \times 10^5$ recorded in Table 1 is
significantly smaller than even the smaller of these values. To address this discrepancy we
reconsider the heuristics used to develop (1). Upon expanding, we see that the left hand
side of (1) is approximately



$$\sum_{m \le X}\sum_{n \le X} \Lambda(m)\Lambda(n)\max(0, h - |m - n|) - h^2 X.$$



This in turn is approximately 

$$h\sum_{n \le X}\Lambda(n)^2 + 2\sum_{k=1}^{h}(h-k)\sum_{n \le X}\Lambda(n)\Lambda(n+k) - h^2 X.$$

By using the Prime Number Theorem with a sharp remainder (we may assume RH), we
see that the first term above is approximately $hX\log X - hX$. As for the second term, we
let $E(X,k)$ be defined by the relation

$$\sum_{n \le X}\Lambda(n)\Lambda(n+k) = \mathfrak{S}(k)X + E(X,k)$$

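where $\mathfrak{S}(k)$ is the singular series defined by Hardy & Littlewood [4] for the Twin Prime
Conjecture,

$$\mathfrak{S}(k) = \prod_{p \mid k}\Bigl(1 + \frac{1}{p-1}\Bigr)\prod_{p \nmid k}\Bigl(1 - \frac{1}{(p-1)^2}\Bigr).$$

If $k$ is odd then $\mathfrak{S}(k) = 0$, but if $k$ is even then

$$\mathfrak{S}(k) = c\prod_{\substack{p \mid k \\ p > 2}}\frac{p-1}{p-2}$$

where

$$c = 2\prod_{p > 2}\Bigl(1 - \frac{1}{(p-1)^2}\Bigr).$$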
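The relation defining $E(X,k)$ can be explored numerically on a small scale. The sketch below is an illustration only, not the computation of Brent [1]: it assumes the modest range $X = 10^5$, uses the known numerical value of the twin prime constant $\prod_{p>2}(1 - 1/(p-1)^2) \approx 0.66016$, and takes $\mathfrak{S}(2) = \mathfrak{S}(4) = c$ and $\mathfrak{S}(6) = 2c$ from the product formula above. All function and variable names are hypothetical.

```python
import math

def von_mangoldt_table(n):
    """Return a list lam with lam[m] = Lambda(m) for 0 <= m <= n."""
    lam = [0.0] * (n + 1)
    is_composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if is_composite[p]:
            continue
        for multiple in range(p * p, n + 1, p):
            is_composite[multiple] = 1
        log_p = math.log(p)
        power = p
        while power <= n:          # Lambda(p^j) = log p for every prime power
            lam[power] = log_p
            power *= p
    return lam

TWIN_PRIME_CONSTANT = 0.6601618158468696   # prod over p > 2 of (1 - 1/(p-1)^2)
c = 2.0 * TWIN_PRIME_CONSTANT

# Singular series for the three shifts used here: only p = 3 can divide k
SINGULAR_SERIES = {2: c, 4: c, 6: 2.0 * c}

X = 10**5                                   # assumed small test range
lam = von_mangoldt_table(X + 6)

ratios = {}
for k in (2, 4, 6):
    total = sum(lam[n] * lam[n + k] for n in range(1, X + 1))
    ratios[k] = total / (SINGULAR_SERIES[k] * X)
    error_term = total - SINGULAR_SERIES[k] * X   # this is E(X, k)
    print(f"k={k}: sum / (S(k) X) = {ratios[k]:.4f}, E(X,k) = {error_term:.1f}")
```

At this small scale the ratios are only expected to be near 1; the size of $E(X,k)$ relative to $X^{1/2}$ is what a larger study would examine.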

It is well-known that $\mathfrak{S}(k)$ is 1 on average, and the estimate with Cesàro weights,

$$\sum_{k=1}^{h}(h-k)\mathfrak{S}(k) = \frac{1}{2}h^2 - \frac{1}{2}h\log h + O(h)$$



was used by Montgomery (1971, unpublished) to guess at the Pair Correlation Conjecture. 
We now refine this estimate. 
Theorem. Let $\mathfrak{S}(k)$ be defined as above. Then

$$\sum_{k=1}^{h}(h-k)\mathfrak{S}(k) = \frac{1}{2}h^2 - \frac{1}{2}h\log h + Ah + O\bigl(h^{1/2+\varepsilon}\bigr)$$

where $A = (1 - C_0 - \log 2\pi)/2$. (Here $C_0$ is Euler's constant.)
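The Theorem lends itself to a direct numerical check. The sketch below is a hedged illustration: it truncates the product for $c$ at primes below $10^6$, evaluates $\mathfrak{S}(k)$ by trial division, and takes $h = 2000$ as an arbitrary test value; none of these choices comes from the original text.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant C_0

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def constant_c(prime_bound=10**6):
    """c = 2 * prod_{p > 2} (1 - 1/(p-1)^2), truncated at prime_bound."""
    c = 2.0
    for p in primes_up_to(prime_bound):
        if p > 2:
            c *= 1.0 - 1.0 / (p - 1) ** 2
    return c

def singular_series(k, c):
    """S(k) = 0 for odd k, and c * prod_{p | k, p > 2} (p-1)/(p-2) for even k."""
    if k % 2 == 1:
        return 0.0
    value, m = c, k
    while m % 2 == 0:
        m //= 2
    p = 3
    while p * p <= m:
        if m % p == 0:
            value *= (p - 1) / (p - 2)
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:                      # remaining odd prime factor
        value *= (m - 1) / (m - 2)
    return value

def cesaro_sum(h, c):
    """Left hand side of the Theorem: sum_{k=1}^{h} (h - k) S(k)."""
    return sum((h - k) * singular_series(k, c) for k in range(1, h + 1))

h = 2000                           # arbitrary test value
c = constant_c()
A = (1.0 - EULER_GAMMA - math.log(2.0 * math.pi)) / 2.0

exact = cesaro_sum(h, c)
coarse = h * h / 2.0 - 0.5 * h * math.log(h)   # estimate with O(h) error
refined = coarse + A * h                        # estimate of the Theorem

print(f"exact - coarse  = {exact - coarse:.2f}")
print(f"exact - refined = {exact - refined:.2f}")
print(f"A * h           = {A * h:.2f}")
```

If the Theorem holds, the first difference should be close to $Ah$ and the second should be much smaller in absolute value.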

When we insert this in the earlier calculation, we come to the conclusion that we should 
expect that 



$$\int_0^X \bigl(\psi(x+h) - \psi(x) - h\bigr)^2\,dx = hX\log\frac{X}{h} + BhX + \text{smaller terms} \tag{2}$$



where $B = -C_0 - \log 2\pi = -2.41509\ldots$. For $X = 10^{10}$ and $h = 10^5$, this more accurate
main term predicts a second moment of $9.098 \times 10^5$, which is much closer to the computed
value, $9.066 \times 10^5$.
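The three competing predictions for the variance are easy to tabulate. In the sketch below, the relation $B = 2A - 1$ records how the constant in (2) arises: the terms $hX\log X - hX$, $2X\bigl(\tfrac12 h^2 - \tfrac12 h\log h + Ah\bigr)$ and $-h^2X$ combine to $hX\log(X/h) + (2A-1)hX$. The variable names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant C_0
X, h = 10**10, 10**5

A = (1.0 - EULER_GAMMA - math.log(2.0 * math.pi)) / 2.0
B = -EULER_GAMMA - math.log(2.0 * math.pi)

cramer = h * math.log(X)                 # Cramér model: h log X
pair_corr = h * math.log(X / h)          # relation (1): h log(X/h)
refined = h * (math.log(X / h) + B)      # relation (2): h (log(X/h) + B)
observed = 9.066e5                       # computed second moment quoted in the text

for name, value in [("Cramér model", cramer), ("relation (1)", pair_corr),
                    ("relation (2)", refined), ("observed", observed)]:
    print(f"{name:13s} {value / 1e5:7.3f} x 10^5")
```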

The main barrier to majorizing the 'smaller terms' in (2) lies in estimating the contribution

$$2\sum_{k=1}^{h}(h-k)E(X,k)$$

of the error terms in the Twin Prime Conjecture. Numerical studies (cf. Brent [1]) suggest
that $E(X,k) \ll X^{1/2+\varepsilon}$, and one may presume that this holds uniformly for $1 \le k \le X$.
Thus the above quantity should be $\ll h^2X^{1/2+\varepsilon}$, but we actually expect that there is some
cancellation in the sum itself, so that the above is $\ll h^{3/2+\varepsilon}X^{1/2+\varepsilon}$. Indeed, when all the
possible sources of error are taken into account, one concludes that the relation (2) may
hold with an error term that is $\ll h^{1/2}X^{1/2+\varepsilon} + h^{3/2+\varepsilon}X^{1/2}$.

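Proof of the Theorem. Let $s(k) = \prod_{p \mid k,\ p > 2}\frac{p-1}{p-2}$. Then

$$\sum_{k=1}^{h}(h-k)\mathfrak{S}(k) = c\sum_{k=1}^{h/2}(h-2k)s(2k) = 2c\sum_{k=1}^{h/2}(h/2-k)s(k).$$

We show that

$$\sum_{k=1}^{K}(K-k)s(k) = \frac{K^2}{c} - \frac{1}{2c}K\log K + \frac{1 - C_0 - \log 4\pi}{2c}K + O\bigl(K^{1/2+\varepsilon}\bigr),$$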

which suffices. Let 



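$$S(s) = \sum_{k=1}^{\infty}\frac{s(k)}{k^s} = \bigl(1 - 2^{-s}\bigr)^{-1}\prod_{p>2}\Bigl(1 + \frac{p-1}{(p-2)(p^s-1)}\Bigr)$$

for $\Re s > 1$. Then

$$S(s) = \zeta(s)\prod_{p>2}\Bigl(1 + \frac{1}{(p-2)p^s}\Bigr) = \zeta(s)T(s),$$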
say, for $\Re s > 0$. Similarly, we note that

$$T(s) = \zeta(s+1)\bigl(1 - 2^{-s-1}\bigr)\prod_{p>2}\Bigl(1 + \frac{2}{(p-2)p^{s+1}} - \frac{1}{(p-2)p^{2s+1}}\Bigr) = \zeta(s+1)\bigl(1 - 2^{-s-1}\bigr)U(s),$$

say, for $\Re s > -1/2$. Clearly,

$$\sum_{k=1}^{K}(K-k)s(k) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty} S(s)\frac{K^{s+1}}{s(s+1)}\,ds$$

when $a$ is a real number, $a > 1$. We move the integral to the abscissa $b$, where $-1/2 < b < 0$,
and consider the residues arising from the simple pole in the integrand at $s = 1$ and the
double pole at $s = 0$. Since $\zeta(s) \sim 1/(s-1)$ when $s$ is near 1, and since $T(1) = 2/c$,
it follows that the residue at $s = 1$ is $K^2/c$. As for the residue at $s = 0$, we recall from
Titchmarsh [8, pp. 16-20] that

$$\zeta(s+1) = \frac{1}{s} + C_0 + O(|s|), \qquad \zeta(0) = -\frac{1}{2}, \qquad \zeta'(0) = -\frac{1}{2}\log 2\pi.$$

Also, $U(0) = 2/c$ and $U'(0) = 0$. Hence, with a little calculation, we see that the residue
at $s = 0$ is

$$-\frac{1}{2c}K\log K + \frac{1 - C_0 - \log 4\pi}{2c}K.$$
As for the remaining integral, we note by the functional equation and Stirling's formula
that $|\zeta(b+it)| \ll_b V^{1/2-b}$ when $V \le t \le 2V$. Also, by the Cauchy-Schwarz inequality,

$$\int_V^{2V}|\zeta(b+1+it)|\,dt \le V^{1/2}\Bigl(\int_V^{2V}|\zeta(b+1+it)|^2\,dt\Bigr)^{1/2} \ll_b V,$$

in view of known mean-square estimates of the zeta function (cf. Theorem 7.2 of Titchmarsh
[8]). Since $U(b+it) \ll_b 1$ for $b > -1/2$, it follows that the integral in question is absolutely
convergent with a value $\ll K^{b+1}$. Since we may take $b$ as close to $-1/2$ as we please, this
gives the stated result.
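The constants entering the residue calculation can be checked numerically. The sketch below is only a sanity check under stated assumptions: the Euler products for $T$, $U$ and $2/c$ are truncated at odd primes below $10^5$, $U'(0)$ is approximated by a central difference, and the final lines confirm that $2c$ times the two residues, with $K = h/2$, reproduces the main terms of the Theorem. The helper names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant C_0

def odd_primes_up_to(n):
    """Odd primes <= n by a simple sieve."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(3, n + 1) if sieve[i]]

PRIMES = odd_primes_up_to(10**5)   # truncation point is an assumption

def log_T(s):
    """log of prod_{p>2} (1 + 1/((p-2) p^s)), truncated."""
    return sum(math.log1p(1.0 / ((p - 2) * p**s)) for p in PRIMES)

def log_U(s):
    """log of prod_{p>2} (1 + 2/((p-2) p^(s+1)) - 1/((p-2) p^(2s+1))), truncated."""
    return sum(math.log1p(2.0 / ((p - 2) * p**(s + 1))
                          - 1.0 / ((p - 2) * p**(2 * s + 1))) for p in PRIMES)

# log(2/c) = -sum log(1 - 1/(p-1)^2), with the same truncation
log_two_over_c = -sum(math.log1p(-1.0 / (p - 1) ** 2) for p in PRIMES)

eps = 1e-4
dlog_U_at_0 = (log_U(eps) - log_U(-eps)) / (2.0 * eps)   # approximates U'(0)/U(0)

print("log T(1) - log(2/c) =", log_T(1.0) - log_two_over_c)
print("log U(0) - log(2/c) =", log_U(0.0) - log_two_over_c)
print("U'(0)/U(0) (approx) =", dlog_U_at_0)

# Residue bookkeeping: 2c * (residue at 1 + residue at 0) with K = h/2
c = 2.0 / math.exp(log_two_over_c)
h = 10**5
K = h / 2.0
residue_at_1 = K * K / c
residue_at_0 = -K / (2.0 * c) * (math.log(K) + math.log(4.0 * math.pi) - 1.0 + EULER_GAMMA)
A = (1.0 - EULER_GAMMA - math.log(2.0 * math.pi)) / 2.0
main_terms = h * h / 2.0 - 0.5 * h * math.log(h) + A * h
print("2c * residues - main terms =", 2.0 * c * (residue_at_1 + residue_at_0) - main_terms)
```

Since each Euler factor of $T(1)$ and $U(0)$ agrees with the corresponding factor of $2/c$, the first two differences vanish up to rounding for any truncation point.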

When approached as above, it seems fortuitous that T(l) = U(0) = 2/c and that 
U'(0) = 0. But miracles do not happen by accident, so it seems that there is something 
going on here that remains to be understood. 